Add Docker Model Runner–Ready Compose Configuration #194
Conversation
This is a Docker Compose file preconfigured for Docker Model Runner (DMR). The YAML includes direct links to official Docker documentation to help users install and enable the docker-model-plugin required for local model execution. The setup uses the lightweight ai/qwen3:0.6B-Q4_0 model (~500 MB), chosen for its small footprint, ability to run on integrated graphics, and reliable performance—making it ideal for testing and low-resource environments. Additionally, /app/temp_audio is now mounted as a tmpfs volume, keeping temporary audio data entirely in RAM. This reduces disk I/O, improves processing speed, and is generally safe given the small size of audio files and the assumption that users running local models typically have sufficient memory.
The README.md has been updated to reference the new DMR-enabled Docker Compose file, helping users easily discover and enable the local model runner setup.
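The setup described above could be sketched roughly as follows. This is an illustrative fragment, not the actual file from the PR: the service name, image, and environment variable names are assumptions, and the DMR endpoint URL should be verified against the official Docker documentation for your version.

```yaml
# Hedged sketch of a DMR-ready compose service; names and values are assumptions.
# DMR itself requires the docker-model-plugin to be installed and enabled first.
services:
  app:
    image: example/app:latest        # placeholder image, not from the PR
    environment:
      # DMR exposes an OpenAI-compatible API; this in-container base URL is an
      # assumption based on Docker's documentation, verify it for your setup.
      OPENAI_BASE_URL: http://model-runner.docker.internal/engines/v1
      OPENAI_MODEL: ai/qwen3:0.6B-Q4_0   # ~500 MB, runs on integrated graphics
    tmpfs:
      - /app/temp_audio              # temp audio stays in RAM, reducing disk I/O
```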
I tried changing the
so I am not really understanding why it keeps giving
Check ai.py, lines 134-158, under the header "Extract content based on format". Below is a potential solution, but I can't test it. Potential reason: the issue arises because DMR (using a llama.cpp backend) and OpenRouter structure their streaming JSON chunks differently. While OpenRouter typically omits the content key entirely when it is empty, DMR explicitly sends "content": null, particularly while the model is generating reasoning_content (the "Thinking" phase). The code's check, `if 'content' in choice['delta']`, evaluated to True for DMR because the key existed, causing the script to crash when it attempted to concatenate that None value to the string.
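A defensive extraction helper along these lines might look like the following. This is a sketch, not the actual ai.py code: the function name and chunk shapes are illustrative, and the key point is that a truthiness check on `delta.get('content')` guards against both an omitted key (OpenRouter) and an explicit `"content": null` (DMR).

```python
def extract_stream_text(chunks):
    """Concatenate delta content from streaming chunks.

    Skips deltas where 'content' is absent (OpenRouter style) or
    explicitly null (DMR style, during the reasoning phase).
    """
    text = ""
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            content = delta.get("content")  # None if missing OR "content": null
            if content:  # guards both cases; never concatenates None
                text += content
    return text
```

The change from `if 'content' in choice['delta']` to a truthiness check is deliberate: key presence alone is not enough when the value can be null.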
I applied your changes to the
That's great! I've included it in my PR #196, where I applied a fix to the OpenRouter rate limit
I created a new pull request for DMR.
This adds a Docker Compose file preconfigured for Docker Model Runner (DMR). The YAML includes direct links to official Docker documentation to guide users through installing and enabling the docker-model-plugin, making it easier to run AI models locally using an OpenAI-compatible API.
The configuration uses the lightweight ai/qwen3:0.6B-Q4_0 model (~500 MB), chosen for its small resource footprint, ability to run on integrated graphics, and surprisingly reliable performance, making it ideal for testing or low-power systems. From my personal experience, a tmpfs mount for /app/temp_audio is far better than a Docker volume in this case, since it keeps temporary audio files entirely in RAM. This significantly reduces disk I/O and improves workflow responsiveness, and it is safe given the small size of typical audio snippets; I don't think the audio files can be larger than 1 GB.
Finally, the README.md has been updated to reference the new DMR-enabled Docker Compose file, helping users easily discover and enable the local model runner setup.
I tried DMR with the OpenAI integration and I am getting this error using Instant Playlist:
Logs had this to say:
Using the configuration I added to the Docker Compose file and bashing into the container to run this, I was able to get an answer from DMR:
I tested the function under ai.py and this was the output:
I am not sure if it is timing out or simply not extracting correctly.
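One way to tell those two failure modes apart is to inspect the raw SSE stream before any extraction logic runs: a timeout produces no `data:` lines at all, while an extraction bug produces lines whose deltas carry `"content": null` alongside reasoning_content. The helper below is a sketch with illustrative names, not part of ai.py; it only parses lines you have already captured, however you obtained them.

```python
import json

def summarize_stream(raw_lines):
    """Parse SSE 'data:' lines from an OpenAI-compatible stream.

    Returns a list of (content, reasoning_content) pairs per delta.
    An empty result suggests a timeout/connection problem; entries with
    content=None next to reasoning text suggest an extraction bug instead.
    """
    seen = []
    for line in raw_lines:
        if not line.startswith("data:"):
            continue  # ignore comments/blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        # Record both keys: DMR may send reasoning_content with content: null
        seen.append((delta.get("content"), delta.get("reasoning_content")))
    return seen
```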